“My aim here is to contribute something to the improvement of learning in our school district, or any school district. My goal is to build a collaborative community in the education domain, helping each other find better and better ways of using data to guide, improve, and govern their systems of learning.

I look forward to the day when everyone at every level is comfortable sketching a simple flow diagram, trying some changes, and plotting the results to see if the changes worked. This straightforward process makes possible, and encourages, the study of different methods of learning and administration (inputs) and their impact on the results observed (outputs) at the local level. If you are interested in using data to help improve learning in your district, please feel free to contact me.”

— Dan Swart

INTRODUCTION

Everyone at every level in a school district is trying to answer important, practical questions and seek data to help them. Those questions include:


If you are a board member, district employee, or concerned citizen who asks themselves questions like these, this paper can help you. If you find little value in changing how you use data as a guide for action, this paper is definitely not for you.


This paper is for anyone striving to improve learning in their school district who thinks data can help them. Anyone can use the simple theory and techniques to significantly improve their teaching, classrooms, campuses, departments, and districts.

To improve our systems of learning in a rational, systematic way we need new information beyond that supplied by mere ‘description-only’ reports. Only a few simple changes are required.

Currently, neither the Texas Education Agency (TEA) nor the Texas Association of School Boards (TASB) offers training or support in the appropriate methods and tools that would give school boards and administrators access to the new information they need.

I predict that educators who learn to draw a very simple flow diagram and very simple charts can (and will) improve any results they wish to improve. These simple tools allow anyone to examine how the inputs of a learning process may (or may not) affect the outputs of the process. Although they are prettier coming from a computer, they can, quite literally, be constructed with a piece of newsprint and a crayon.

Thus, anyone can experiment on a small scale and evaluate the results before attempting large scale changes. Those results are easily shared with others, such as school boards and administrators, and form a rational basis for understanding the results observed and any proposed changes.

All that is required is an open mind, willingness to do things a little differently, and a sense of urgency for getting things done. That sounds like Texans, to me.


BACKGROUND

All school boards and superintendents work very hard to ‘keep the lights on’ and improve learning in their districts and campuses (or ‘campi’ if you prefer the Latin form, because it is funnier).

Administration of budgets for facilities, equipment, personnel, safety and legislative compliance is a function of the managerial skill of the executive team and the school board. It is usually well managed with thoughtful planning and careful attention to fiscal details. Both TEA and TASB supply quality education and support for the budgeting element of school board responsibility.

Improving learning in a district is a very different challenge and requires a different set of skills and knowledge.

Currently, there are dozens of situations where we need better ways to evaluate our data.

Metrics That Might Be of Interest
STAAR scores
Survey results
District improvement plans
TAPR reports
Dropout rates
Pedagogy review
On-time arrivals for buses
Attendance rates
Superintendent evaluations
Graduation rates
Superintendent searches
Accountability ratings


To make better decisions, the ‘description-only’ reports we currently use to evaluate the performance of programs, pedagogy, people, or anything else must come to us in a slightly different format than what we are used to seeing.

Any evaluation of results must go beyond what is included in ‘snapshot’ reports, or even reports with two-point or three-point comparisons. As discussed later, trends cannot be established with only two or three data points.

As it turns out, the “new” information we need is already there in our data, but it is ‘hidden’. As long as we continue to use and rely on ‘description-only’ reports this critical information will remain hidden from us.

The following report is called an Individuals Chart (part of an XmR chart pair)1 2 and is an example of a report that includes the information necessary to create concise, reliable, well-codified reports, charts, and documents. This chart analyzes public STAAR data provided by TEA, and includes a brief interpretation of the chart. The XmR chart provides a baseline (context) for evaluating both past and future performance.


Fig: Data taken from 5 TEA “Snapshot” description-only reports and 1 TAPR description-only report



INTERPRETING THE CHART3


With this one simple chart we are in a position to address each of the practical, important questions outlined earlier, at least for this particular STAAR category and level. Why would you settle for less? No other report even comes close.


Interesting Observation: Following the COVID pandemic we all heard about the loss-of-learning “crisis” in schools. I wanted to see if such a “crisis” could be detected using quantitative methods, not just personal opinion. The quantitative evidence of the XmR chart indicates that even the extreme government measures imposed during the COVID-19 pandemic apparently were not disruptive enough to the district to create even a single point outside the “normal” limits. Or, the counter-measures employed by the district were very effective.




WHAT’S WRONG WITH THE ‘DESCRIPTION-ONLY’ REPORTS WE USE NOW?

The aim of a TEA report is ‘description-only’: how many, or how much. The reports estimate how many people belong to this or that category for a specified academic year, or they report the financial aspects of operations during the school year. Their aim is NOT to find out WHY there are so many or so few in this or that category; merely how many.

The action to be taken is on the district, campus, or people studied. It is a process designed for handing out reward and punishment.

On the other hand, school boards want to improve learning in the future.

Interest centers on future learning, not the learning of the past. For example: adopt Policy B over A, or hold on to Policy A, or try Policy C on a small scale (with further study).

The TEA collects data from every school in Texas, performs a multitude of calculations, and produces reams of ‘description-only’ reports and tables summarizing their results by academic year. These ‘description-only’ reports consist primarily of calculations mandated by the state legislature, including those designed to allow districts to fulfill their public notification requirements.

These statistical summaries are available to the public and the school districts. Unfortunately, for the public and the school districts, the information is not nearly as helpful as it could be. In fact, it takes so much ‘data wrangling’ to put it into a form that is remotely helpful for assessing and improving district and campus learning that it is simply not done.

Instead, administrators and school boards stare at endless rows and columns of mind-numbing data fragments, ‘sliced and diced’ to an extent that would make even the inventor of the Chop-O-Matic envious!

After earnestly poring over such ‘description-only’ reports, users still have no rational basis to predict what results to expect in the future.


Here’s a partial example from a ‘description-only’ TEA report called a ‘Snapshot’:

Fig: Small sample from TEA's Snapshot description-only report


The entire report can be found here:

(HTML Link)


And, here’s a partial example from a TEA ‘description-only’ report called a ‘School Report Card’:


Fig: Small sample from TEA's School Report Card description-only report



The entire report can be found here:

(HTML Link)




OBSTACLES TO IMPROVEMENT

So, what is a school district to make of all these mountains of numbers grouped and mixed together? Do they help districts improve learning in the future? In their present form, the answer is definitely “NO!”

OBSTACLE #1

When it comes to improving the system of education in a district, this ‘snapshot’ view of the data has several very important limitations. One is the grouping of several categories, which probably represent different processes, into a single reported figure. Some TEA ‘description-only’ reports combine more than one rating level into a single reporting level. This assumes that the process that produced the results in one category produces the results in them all.

For instance, one combines the ‘Approaches Grade Level’ AND the ‘Meets Grade Level’ AND the ‘Masters Grade Level’ ratings all into one combined group total and calls it the “Approaches Grade Level Standard AND HIGHER” level.

To make that category useful for improvement efforts we have to break it into the three separate levels, so each represents a single situation (process). Currently, that turns out to be quite tedious and extremely time-consuming.

The ‘Snapshot’ ‘description-only’ report offered by the TEA is a good example. My hat is off to their programmers. It is an impressive compilation of statistics all in one place. It is well-suited for the needs of the TEA and their reporting responsibilities to the legislature; but completely ill-suited for helping the concerned school board, superintendent, or even tax-paying citizen make an analysis aimed at improving future learning. One might call it a triumph of computation over value.

Although this grouping of categories permits TEA to render abstract judgments about a district or campus, the lack of specificity renders these ‘description-only’ reports unsuitable for understanding any particular process, category or level your board may be interested in monitoring or improving. (Be mindful that this is not TEA’s fault; they must follow the direction of the legislative intent.)

By analogy, it is like trying to repair your car by reading the overall repair records of similar models. What you need is diagnostic data about what is happening (or not happening) with your specific vehicle.


Here is an example of several STAAR categories grouped together in a TEA ‘description-only’ report:

Fig: The first two rows combine more than one achievement level


The section reporting STAAR scoring by ethnicity:

Fig: The first two rows combine more than one achievement level



OBSTACLE #2

These ‘description-only’ reports are ‘ill-suited’ for district improvement efforts for another reason: the TEA ‘description-only’ reports are siloed by academic year, making it nearly impossible to create a time-period-by-time-period data frame of any categories or levels the board may wish to analyze. In fact, without time-period-by-time-period data no meaningful analysis is possible.

This siloing of academic years stands as a bristling obstruction to the analysis of past performance and estimation of future results. Even the least experienced board members intuitively know the importance of numerical context when reviewing data for their district.


It was our innate desire for context that created the “this-month vs this-month-last-year vs year-to-date” presentations so popular today. “This vs that” presentations are two-point comparisons that waste more resources than all our unnecessary meetings put together.



OBSTACLE #3

“At sea one day you will smell land where there be no land.”

— Elijah, to Ishmael and Queequeg

Moby Dick (film, 1956)


Obstacle number 3 is not related to the TEA ‘description-only’ reports at all. It has to do with our human nature (or, as Jessica Rabbit put it in the 1988 film ‘Who Framed Roger Rabbit’: “I’m not bad, I’m just drawn this way!”).

For survival, humans evolved with the ability to rapidly detect patterns. That instinct can lead us astray when studying our data. Unfortunately, we quite naturally see ‘trends’ and ‘signals’ where there are none.

Consider that for any three measurements there are at least 10 completely random arrangements that can arise. And, we have created names to categorize these completely artificial ‘trends’ that we think we see:


Fig: Ten random arrangements of 3 measurements


How many of those ‘trends’ do you see in this example of TEA data placed on a bar chart?

Fig: Do you see any trends or signals?



So, we need a report that can separate the ‘common cause’ variation (routine variation) from the ‘special cause’ variation that signals a true trend, or a detectable event that changes the system.

Our current ‘description-only’ reports do not make that separation for us.


The XmR chart does.


OBSTACLE #4

Obstacle #3 is about our perceptions. Obstacle #4 is about our emotions. We have a very human impulse to intervene any time we feel something is ‘wrong’ and needs to be ‘fixed.’ The incentives for board members to seek interventions are many and varied, but it is the type of change proposed that really matters.

This is also a situation where board members must remain vigilant about getting involved in matters that are administrative (the responsibility of the superintendent) and not district policy (the responsibility of the board).

A problem arises if we react to a result that is merely random (‘common-cause’ variation) as if it were a ‘special cause’ signal. This is called ‘tampering’ and is guaranteed to make things worse. The variation we observe will increase and our good results will be diluted accordingly. At worst, we can de-stabilize a stable system that people have worked hard to develop.

This can be particularly acute when there is an incident involving misfortune or safety. Although rare, these events may be quite distracting and upsetting. However, trying to keep unpleasant events from ‘ever happening again’ is impossible and can make things worse, not better.

It is part of the leadership responsibility of boards to provide a measured and rational response when misfortune comes our way. Knowing about ‘common cause’ and ‘special cause’ variation can help us do that.


NEW INFORMATION TO HELP US

What makes these proposed new reports so powerful are three unique features:

  1. Prior period results for the process are plotted in time order, oldest to newest, left to right (the more time periods presented the better). This is called context and permits comparing our inputs to the outputs (results).

  2. An average (or median) for the entire group of results, plotted as a center line. If the process is stable, this is the best estimate for future results.

  3. One simple calculation that sets the upper and lower bounds of results we can reasonably expect in the future. This is called analysis.

All we need are those three things: context, an estimate of future performance, and analysis. All three use simple arithmetic, not even algebra. But none of the ‘description-only’ reports currently available has all three elements. Usually, they have none of the three.
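To show just how simple the arithmetic is, here is a minimal sketch of all three elements in plain Python. The scores are made up for illustration, and the 2.66 scaling constant is the standard one for Individuals (XmR) charts; this is a sketch of the method, not the author's Excel add-in.

```python
# A minimal sketch of the three elements, in plain Python.
# The scores below are illustrative values, not district data.
scores = [84, 88, 87, 79, 78, 80]  # one result per time period, oldest first

# 1. Context: the values are kept in time order, oldest to newest.

# 2. Estimate of future performance: the center line is the plain average.
center = sum(scores) / len(scores)

# 3. Analysis: the average moving range (the up-and-down movement between
#    consecutive points), scaled by the standard XmR constant 2.66, sets
#    the upper and lower natural process limits.
moving_ranges = [abs(b - a) for a, b in zip(scores, scores[1:])]
avg_mr = sum(moving_ranges) / len(moving_ranges)
upper = center + 2.66 * avg_mr
lower = center - 2.66 * avg_mr

print(f"center line: {center:.1f}")           # 82.7
print(f"limits: {lower:.1f} to {upper:.1f}")  # 74.2 to 91.2
```

For these made-up scores the center line is 82.7 with limits of roughly 74.2 to 91.2 - the range of results the process can be expected to deliver until it is changed in some fundamental way.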

Accordingly, those ‘description-only’ reports cannot answer the important questions. We are left with only a vague intuition about what the data are trying to tell us. All too often intuition and personal experience have proven to be unreliable guides for action on policy, performance, and pedagogy.


ANALYSIS - WHAT IS IT?

Analysis was developed to help separate the actual signals in the data from the random fluctuations that are always present in any measurement.

In this context, analysis means to subject the data to ONE very simple statistical test to see if the system (or process) producing the data shows evidence of being in a stable state, with a predictable output capability.

If stable, this output capability will show as a range of values that constitute ‘normal’ for the process. Until the process itself changes in some fundamental way, future measurements will wander between the expected output limits; sometimes upward, sometimes downward, sometimes the same.
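That one simple test amounts to a single comparison per point: any value outside the computed limits is evidence of a ‘special cause’; everything inside is routine variation. The helper name and the sample series below are illustrative assumptions of mine, not taken from any TEA report.

```python
def classify(values):
    """Label each value 'routine' (common cause) or 'signal' (special cause)
    against natural process limits computed from the whole series."""
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    avg_mr = sum(moving_ranges) / len(moving_ranges)
    lower = center - 2.66 * avg_mr
    upper = center + 2.66 * avg_mr
    return ["signal" if v < lower or v > upper else "routine" for v in values]

# Illustrative series: a stable run followed by a genuine shift.
print(classify([50, 52, 49, 51, 50, 48, 51, 70]))
```

For this sample series the first seven points come back ‘routine’ and the final jump to 70 comes back ‘signal’ - exactly the separation that ‘description-only’ reports cannot make for us.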


For those thinking about the additional work needed to provide an analysis like this when reporting data to the board, to the public, or to each other: please note that once the data were wrangled, it took less than 5 minutes to create the XmR chart using a very inexpensive Excel add-in!


OVERCOMING THE OBSTACLES

As it turns out, creating the analytic reports we need is very simple - IF we can get our data into the right form for analysis. Because they report results only in the usual ‘description-only’ formats, neither the Texas Education Agency (TEA) nor the Texas Association of School Boards (TASB) appears able to help us with this particular limitation. It is up to us to take control of our own reporting needs.


GETTING BLOOD FROM THE TURNIP

After an embarrassing amount of time spent ‘data wrangling’, I was able to construct a small time-period-by-time-period data frame of district-level STAAR scores for all subject-matter categories for the academic years ended 2016-2022, built from the 5 available TEA ‘Snapshot’ ‘description-only’ reports and 1 Texas Academic Performance Report (TAPR) ‘description-only’ report. (No STAAR scores were available for the 2019 academic year.)

The TEA STAAR Data Divided Into Constituent Parts for the Subject ‘Math’

  Year   Approaches   Meets Std   Masters   Combined   Failing
  2016       31%         29%         24%        84%       16%
  2017       27%         31%         30%        88%       12%
  2018       25%         29%         33%        87%       13%
  2020       29%         27%         23%        79%       21%
  2021       30%         27%         21%        78%       22%
  2022       31%         31%         18%        80%       20%

An amazing amount of very useful information is now available to us using just this small data frame.
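The wrangling step can be sketched in a few lines of Python using the pandas library. The per-year dictionaries below stand in for values transcribed by hand from each yearly report (TEA provides no machine-readable feed of this shape); stacking them yields the time-period-by-time-period frame above, including the ‘Failing’ column TEA does not report.

```python
import pandas as pd

# Stand-in extracts, one per yearly report, keyed by achievement level.
# The numbers are the Math percentages from the table above; in practice
# each dictionary would be transcribed from one 'Snapshot' or TAPR report.
snapshots = {
    2016: {"Approaches": 31, "Meets": 29, "Masters": 24},
    2017: {"Approaches": 27, "Meets": 31, "Masters": 30},
    2018: {"Approaches": 25, "Meets": 29, "Masters": 33},
    2020: {"Approaches": 29, "Meets": 27, "Masters": 23},
    2021: {"Approaches": 30, "Meets": 27, "Masters": 21},
    2022: {"Approaches": 31, "Meets": 31, "Masters": 18},
}

# Stack the per-year extracts into one time-period-by-time-period frame.
df = pd.DataFrame.from_dict(snapshots, orient="index").sort_index()
df["Combined"] = df.sum(axis=1)        # the grouped level TEA reports
df["Failing"] = 100 - df["Combined"]   # the column TEA does not report

print(df)
```

From here, any column can be fed directly into an XmR calculation or charting tool.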


FOOLED BY RANDOMNESS - AN EXTENDED EXAMPLE

When looking at the results in the table above it would be difficult NOT to feel something about several categories, particularly the changes observed in the ‘Masters’ and ‘Failing’ columns. They seem large, and negative.

Like everyone else, I want some numerical context so I put the data in the form of bar charts by year:


Fig: Data that is not grouped together taken from 5 TEA “Snapshot” description-only reports and 1 TAPR description-only report


However, I know from training and long experience that it is dangerous to rely solely on visual inspection of data that is simply displayed and not analyzed.

The XmR charts tell us a very different story.


Fig: Data taken from 5 TEA “Snapshot” description-only reports and 1 TAPR description-only report



And, here’s one that TEA does not provide, but that school boards would likely be keenly interested in – the ‘FAILING’ column:

Fig: Data taken from 5 TEA “Snapshot” description-only reports and 1 TAPR description-only report



The following additional reports (XmR charts) use ungrouped data and include the necessary information to help us. Again, they cover public STAAR data provided by TEA, along with a brief interpretation of the charts taken as a whole. They provide a baseline (context) for evaluating both past and future performance in these additional categories and levels.


Fig: Data taken from 5 TEA “Snapshot” description-only reports and 1 TAPR description-only report


INTERPRETING THE CHARTS


Technical Note For the Statistically Curious: The calculated limits are NOT set at 3 standard deviations above and below the average. The various statistical tests you learned in Statistics 101 cannot help you when your attention is focused on the future output of your district, campus, or pedagogy. They were designed to characterize a population as it exists at a single moment in time - how much or how many; without regard to why the population came to be the way that it is. Even the forecasting techniques are inappropriate for the task at hand.



LEADERSHIP REACTION

The content of your reports matters. The appropriate reaction of management (which includes the school board) is completely different if the system or process indicates a stable state (predictable) than when it shows evidence of being in an unstable state (statistically speaking, of course).

Changing a stable process in reaction to so-called performance indicators (KPIs) that fall within the ‘normal’ range of a predictable process is harmful. It makes things worse; no matter how emotional you may become if you think you see things ‘going in the wrong direction.’

Reacting to indicators falling outside the identified ‘normal’ range is productive. If the indicator is in the ‘good’ direction efforts to identify the causes and duplicate them are almost always successful. If they fall in the ‘bad’ direction, efforts to identify the causes and remove them are worthwhile.

If your reports are ‘description-only’ you cannot possibly determine the most appropriate action for the circumstances.


Cautionary Note: Districts should cease dependence on recommendations for action based on outside observational studies relying solely on correlations deemed “statistically significant”. This so-called “evidence based” research is not based on generally accepted “cause and effect” experimental design. Despite the obligatory disclaimers, calls for action often imply “cause and effect” relationships.5

Remember, theories of a flat earth were “evidence based”, as are most stereotypes. The strength and quality of the evidence is more important than whether there is any evidence at all. It is far better to see what works in your district through the use of these methods and your own efforts.



Should we do nothing when we think we need a change? ABSOLUTELY NOT! However, WHAT we do and HOW we do it is very different under each condition. Those who simply want ‘change’, without regard to whether or not the process is in a stable state, will nearly always make things worse.

BEGIN WHERE YOU ARE


SCHOOL BOARDS

Metrics and goals defined in strategic plans, district improvement plans, TEA scoring targets, safety and other district-level initiatives can and should be assessed and evaluated with the aid of the new information provided by an XmR chart. Even financial information can be analyzed this way.

SUPERINTENDENTS & ADMINISTRATORS

Legacy programs and efforts to introduce new programs aimed at improvement are all perfect candidates for this simple new chart. Campus-level issues (good and bad) are best analyzed with the new information. Insight gathered this way helps answer questions like, “Is this initiative or program producing the results we want?”

INSTRUCTION

Classroom attendance, MAP scores, disciplinary events, and new pedagogies can be assessed with the aid of the new information. Anything a teacher may be concerned about can quickly and easily be tracked and analyzed.

FUNCTIONAL AREAS

Facilities managers, bus drivers, and custodians can ask and answer questions such as:

SUMMARY

To go from monitoring a system to improving a system we must move from a ‘snapshot’ point of view to an ‘analysis’ point of view. They are two different jobs and require different theory, tools and methods. That means the data we use for improvement must include the context of time order, an estimate of future performance, and a range of values we can think of as ‘normal’ variation.

Fig: This “description-only” form of report cannot help us to improve


Fig: This “description-only” form of report cannot help us to improve


The crucial information for analysis of district output capability (and the prediction of future performance) is contained in the order of occurrence found in the time-period-by-time-period data. The more periods included in the analysis, the stronger the evidence for your inferences from the charts.

Management in all forms is prediction. Therefore, management should use analytic reports instead of ‘description-only’ reports. Management should focus on improvement of processes for future results, not solely on judgment of past results.


  1. The chart shown is the ‘Individuals’ chart. A complete XmR chart presentation also includes the ‘Moving Range’ chart that plots the size of the up-and-down movement between data points. The Moving Range chart is used to calculate the range of values that is considered ‘normal’ for the process (i.e., the ‘common cause’ variation present in the process).
    ↩︎

  2. The ‘XmR’ chart is one of a family of ‘control charts’ invented and developed by Walter A. Shewhart in 1929 while an engineer at Bell Telephone Laboratories. They are still the gold standard tool for understanding a process by separating ‘common cause’ variation from ‘special cause’ variation. The XmR chart is the “voice of the process” and reveals if your process is stable (predictable) or subject to a strong influence that makes it unpredictable.↩︎

  3. Since interpretation of the chart concerns future results expected of the process, the chart represents evidence that should influence our degree of belief in what the future holds. Since the future is uncertain, there is always some probability that our expectation will be wrong. Additional data helps a lot to reduce the uncertainty but will never eliminate it.↩︎

  4. Statistical techniques to detect ‘trends’ exist; they generally require 7 or more data points. A special ‘trend’ occurs when a point falls outside the range of ‘normal’ variation. This is called a ‘special’ cause of variation. The rules to detect these trends are easily learned and are built into software that specializes in control chart preparation.↩︎

  5. See for instance https://scholarworks.umt.edu/etd/1387/↩︎